Abstract

The concept of a temporal phylogenetic network is a mathematical model of evolution of 
a family of natural languages. It takes into account the fact that languages can trade their 
characteristics with each other when linguistic communities are in contact, and also that 
a contact is only possible when the languages are spoken at the same time. We show how 
computational methods of answer set programming and constraint logic programming can 
be used to generate plausible conjectures about contacts between prehistoric linguistic 
communities, and illustrate our approach by applying it to the evolutionary history of 
Indo-European languages. 

KEYWORDS: phylogenetics, historical linguistics, Indo-European languages, answer set 
programming, constraint logic programming 



1 Introduction 

The evolutionary history of families of natural languages is a major topic of research 
in historical linguistics. It is also of interest to archaeologists, human geneticists, 
and physical anthropologists. In this paper we show how this work can benefit from 
the use of computational methods of logic programming. 

Our starting point here is the mathematical model of evolution of natural languages
introduced in (Ringe et al. 2002) and (Nakhleh et al. 2005). As proposed in
(Erdem et al. 2003), we describe the evolution of languages in a declarative formalism
and generate conjectures about the evolution of Indo-European languages using
an answer set programming system. Instead of the system SMODELS,¹ employed in
earlier experiments, we make use of the new system CMODELS,² which leads in
this case to much better computation times. Our main conceptual contribution is
extending the definition of a phylogenetic network from (Nakhleh et al. 2005) by
explicit temporal information about extinct languages: estimates of the dates
when those languages could be spoken. Computationally, to accommodate this
enhancement we divide the work between two systems: CMODELS and the constraint
logic programming system ECLiPSe (http://www-icparc.doc.ic.ac.uk/eclipse/).

Fig. 1. A temporal phylogeny (a), and a perfect temporal network (b) with a lateral
edge connecting B1 with D1.

It was observed long ago (see, for instance, (Gleason 1959)) that if linguistic
communities do not remain in effective contact as their languages diverge then the
evolutionary history of their language family can be modeled as a phylogeny, a tree
whose edges represent genetic relationships between languages.³ Fig. 1(a) shows the
extant languages A, B, C, D, along with the common ancestor E of A and B, the
common ancestor F of C and D, and the common ancestor R (for "root") of E
and F. The time line to the right of the tree shows that the ancestors of A and B
diverged around 300 CE, the ancestors of C and D diverged around 800 CE, and
the ancestors of E and F diverged around 1000 BCE.

Sometimes languages inherit their characteristics from their ancestors, and sometimes
they trade them with other languages when two linguistic communities are
in contact with each other. The directed graph in Fig. 1(b), obtained from the tree
in Fig. 1(a) by inserting two vertices and adding a bidirectional edge, shows that
the ancestor B1 of B, spoken around 1400 CE, was in contact with the ancestor
D1 of D.



This idea has led Nakhleh, Ringe and Warnow (2005) to the definition of a
phylogenetic network, which extends the definition of a phylogeny by allowing lateral
edges, such as the edge connecting B1 with D1.⁴ The modification of their mathematical
model proposed below takes into account the fact that two languages cannot
be in contact unless they are spoken at the same time. Geometrically speaking, every
lateral edge has to be horizontal. For instance, in Fig. 1(a) there can be no
contact between an ancestor of E and a descendant of F, although such contacts
are not prohibited in the definition of a phylogenetic network. To express the idea
of a chronologically possible network in a precise form, we introduce "temporal
networks": networks with a "date" assigned to each vertex. Dates monotonically
increase along every branch of the phylogeny, and the dates assigned to the ends of
a lateral edge are equal to each other.⁵

¹ http://www.tcs.hut.fi/Software/smodels/

³ We understand genetic relationships between languages in terms of linguistic "descent": a
language Y of a given time is descended from a language X of an earlier time if and only if X
developed into Y by means of an unbroken sequence of instances of native-language acquisition
by children.

Once dates are assigned to the vertices of a phylogeny, we can talk not only about
the languages that are represented by the vertices, but also about the "intermediate"
languages spoken by members of a linguistic community at various times. In
the example above we would represent the language spoken by the ancestors of the
linguistic community A at time t by the pair $A{\uparrow}t$, where $300 < t \le 2000$. This pair
can be visualized as the point at level t on the edge leading to A. In our idealized
representation, t ranges over real numbers, so that the set of such pairs is infinite.
Language B1 in Fig. 1(b) can be denoted by $B{\uparrow}1400$, and D1 can be written as
$D{\uparrow}1400$.

The characteristics of a language that it could inherit from ancestors or trade with
other languages are called (qualitative) characters. A phylogeny describes every leaf
of the tree in terms of the values, or "states," of the characters. For instance, zeroes
and ones next to A, B, C and D in Fig. 1(a) represent the states of a 2-valued
character. They can indicate, for example, that a certain meaning is expressed by
cognates⁶ in languages A and C (cognation class 0), and that it is also expressed
by cognates in languages B and D (cognation class 1).

The main definition in this paper (similar to Definition 12.1.3 from (Nakhleh 2004))
is that of a "perfect" temporal network. A perfect network explains how every state
of every qualitative character could evolve from its original occurrence in a single
language in the process of inheriting characteristics along the tree edges and trading
characteristics along the lateral edges of the network. For instance, Fig. 1(b)
extends the phylogeny in Fig. 1(a) to a perfect network by labeling the internal
vertices of the graph. Indeed, consider the subgraph of Fig. 1(b) induced by

⁴ It was once customary to oppose a "tree model" of language diversification, in which languages
speciate relatively cleanly to produce a definite phylogenetic tree, and a "wave model", in which
dialects evolve in contact, sharing innovations in overlapping patterns which are inconsistent
with a phylogenetic tree (see, e.g., (Hock 1986, pages 444-455, with references on page 667)). But
active researchers have long recognized that each model is appropriate to a variety of different
real-world situations (cf. the discussion of (Ross 1997)). It therefore makes sense to explore
models that incorporate the strengths of both, such as tree models which incorporate edges
representing contact episodes between already diversified languages.

⁵ These two assumptions imply the "weak acyclicity" condition from the definition of a
phylogenetic network in (Nakhleh 2004, Section 12.1).

⁶ Note that we here use the term 'cognates' as a cover term for true cognates (inherited from
a common ancestor) and words shared because of borrowing. There does not seem to be a
convenient term that covers both types of cases.
the set {A, C, E, F, R} of the vertices that are labeled 0. This subgraph is a tree
with the root R; this fact shows that state 0 has evolved in this network from its
original occurrence in R. Similarly, the subgraph of Fig. 1(b) induced by the set
{B, B1, D, D1} of the vertices labeled 1 contains a tree with the root B1, and also
a tree with the root D1. This means that state 1 could evolve from its original
occurrence in language B1 (or in an ancestor of B1 that is younger than E).
Alternatively, state 1 could evolve from its original occurrence in language D1, or in
an ancestor of D1 that is younger than F.

The computational problem that we are interested in is the problem of reconstructing
the temporal network representing the evolution of a language family,
such as Indo-European languages. This problem can be divided into two parts:
generating a "near-perfect" phylogeny, and then generating a small set of additional
horizontal edges that turn the phylogeny into a perfect network. (In a plausible
conjecture about the evolution of languages the number of lateral edges has to be
small: inheriting characteristics of a language from its ancestors is far more probable
than acquiring them through borrowing, unless the dataset includes a large
proportion of words that are highly likely to be borrowed.) The first part, generating
phylogenies, has been the subject of extensive research; applying answer set
programming to this problem is discussed in (Brooks et al. 2005). In this paper we
address the second part of the problem: turning a phylogeny into a perfect network.

As to the dates assigned to the vertices of the phylogeny, we assume that they 
are known approximately. For instance, about the graph from Fig. 1(a) we may
only know that language E was spoken between 100 BCE and 500 CE, and that 
language F was spoken between 600 CE and 1100 CE. Since these two intervals 
do not overlap, we can conclude from these assumptions, just as in the case of the 
specific dates assigned to E and F in Fig. 1(a), that a contact between an ancestor
of E and a descendant of F would be impossible. If, however, the given temporal 
intervals for E and F were wider then such a conclusion might not be warranted, 
and a contact between an ancestor of E and a descendant of F might become an 
acceptable conjecture. 

Our method allows us to turn a phylogeny, along with temporal intervals assigned 
to its vertices, into a perfect network by adding a small number of lateral edges, or 
to determine that this is impossible. 

In this paper, after describing the problem mathematically in Section 2, we show
how it can be solved using ideas of answer set programming and constraint logic
programming (Section 3), and then apply our approach to the problem of reconstructing
the evolutionary history of Indo-European languages (Section 4).



2 Problem Description 

In this section we show how the problem of computing perfect temporal phylogenetic
networks built on a given temporal phylogeny can be described as a graph problem.
Recall that a rooted tree is a digraph with a vertex of in-degree 0, called the root,
such that every vertex different from the root has in-degree 1 and is reachable from
the root. In a rooted tree, a vertex of out-degree 0 is called a leaf.

Fig. 2. A temporal phylogeny (a), and a perfect temporal network (b) with a lateral
edge connecting $B{\uparrow}1750$ with $D{\uparrow}1750$.



2.1 Temporal Phylogenies

A phylogeny is a finite rooted tree (V, E) along with two finite sets I and S and a
function f from $L \times I$ to S, where L is the set of leaves of the tree. The elements
of I are usually positive integers ("indices") that represent, intuitively, qualitative
characters, and elements of S are possible states of these characters. The function
f "labels" every leaf v (the extant or most recently spoken language in one of
the branches) by mapping every index i to the state f(v, i) of the corresponding
character in that language.

For instance, Fig. 1(a) is a phylogeny with I = {1} and S = {0, 1}. Fig. 2(a) is
a phylogeny with I = {1, 2} and S = {0, 1}; state f(v, i) is represented by the i-th
member of the tuple labeling leaf v.

A temporal phylogeny is a phylogeny along with a function $\tau$ from vertices of
the phylogeny to real numbers such that for every edge (u, v) of the phylogeny
$\tau(u) < \tau(v)$. Intuitively, $\tau(v)$ is the time when language v was spoken. We will
graphically represent the values of $\tau$ by placing a vertical time line to the right of
the tree, as in Fig. 1(a) and Fig. 2(a).

2.2 Contacts and Networks

As discussed in the introduction, a contact between two linguistic communities
can be represented by a horizontal edge added to a pictorial representation of a
temporal phylogeny. The two endpoints of the edge are simultaneous "events" in
the histories of these communities. An event can be represented by a pair $v{\uparrow}t$, where
v is a vertex of the phylogeny and t is a real number.

To make this idea precise, consider a temporal phylogeny T; let V be the set of
its vertices, R its root, and $\tau$ its time function. For every $v \in V \setminus \{R\}$, let par(v)
be the parent of v. An event is any pair $v{\uparrow}t$ such that $v \in V \setminus \{R\}$ and t is a real
number satisfying the inequalities

$$\tau(\mathrm{par}(v)) < t \le \tau(v). \qquad (1)$$

Events $v{\uparrow}t$ and $v'{\uparrow}t'$ are concurrent if t = t'. A contact is a set consisting of two
different concurrent events.
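The definitions above can be illustrated with a minimal Python sketch on the tree of Fig. 1(a). The dictionary names `parent` and `tau` and the helper functions are hypothetical, negative numbers stand for BCE dates, and the date 2000 for the extant leaves is an assumption of this sketch (the text states it only for A).

```python
# Temporal phylogeny of Fig. 1(a); negative numbers stand for BCE dates.
# Assumption of this sketch: every extant leaf is dated 2000.
parent = {"E": "R", "F": "R", "A": "E", "B": "E", "C": "F", "D": "F"}
tau = {"R": -1000, "E": 300, "F": 800,
       "A": 2000, "B": 2000, "C": 2000, "D": 2000}


def is_temporal(parent, tau):
    """Dates must strictly increase along every edge (par(v), v)."""
    return all(tau[p] < tau[v] for v, p in parent.items())


def is_event(v, t):
    """Inequality (1): tau(par(v)) < t <= tau(v) for a non-root vertex v."""
    return v in parent and tau[parent[v]] < t <= tau[v]


def is_contact(e1, e2):
    """A contact consists of two different concurrent events."""
    (u, t1), (v, t2) = e1, e2
    return e1 != e2 and is_event(u, t1) and is_event(v, t2) and t1 == t2


print(is_temporal(parent, tau))               # True
print(is_contact(("B", 1400), ("D", 1400)))   # True
print(is_contact(("A", 500), ("C", 500)))     # False: the edge to C starts at 800
```

The last call fails because the pair (C, 500) violates inequality (1), which is the formal counterpart of the requirement that lateral edges be horizontal.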

Any finite set C of contacts defines a (temporal phylogenetic) network: a digraph
obtained from T by inserting the elements $v{\uparrow}t$ of the contacts from C as intermediate
vertices and then adding every contact in C as a bidirectional edge. We will discuss
now a simple case that is particularly important for applications, defined as follows.

We say that a set C of contacts is simple if

• for every event $v{\uparrow}t$ that belongs to a contact from C, $t < \tau(v)$, and

• for every vertex v of T there exists at most one number t such that $v{\uparrow}t$ belongs
to some contact from C.

The first condition expresses that the second inequality in (1) holds as a strict
inequality, so that for every event $v{\uparrow}t$ that belongs to a contact from C

$$\tau(\mathrm{par}(v)) < t < \tau(v). \qquad (2)$$

In other words, it says that the endpoints of all lateral edges are different from the
vertices of T; each of them subdivides an edge of T into two edges. The second
condition says that the endpoints of the lateral edges do not subdivide any of the
edges of T into more than two parts. It is clear, for instance, that the set consisting
of the single contact

$$\{B{\uparrow}1400,\ D{\uparrow}1400\} \qquad (3)$$

in Fig. 1 and the set consisting of the single contact

$$\{B{\uparrow}1750,\ D{\uparrow}1750\}$$

in Fig. 2 are simple.

If C is simple then the corresponding network can be described as follows. The
set of its vertices is the union of the set V of vertices of T with the union $V_C$ of
the contacts from C. Its set $E_C$ of edges is obtained from the set E of edges of T
in two steps. First, for every event $v{\uparrow}t$ in $V_C$ we replace the edge (par(v), v) from
E by its "two halves", the edges

$$(\mathrm{par}(v),\ v{\uparrow}t) \quad\text{and}\quad (v{\uparrow}t,\ v).$$

Second, for every contact $\{u{\uparrow}t, v{\uparrow}t\}$ in C we add a "bidirectional lateral edge", the
pair of edges

$$(u{\uparrow}t,\ v{\uparrow}t) \quad\text{and}\quad (v{\uparrow}t,\ u{\uparrow}t).$$
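The two-step construction of the edge set can be sketched in Python as follows. The function name `network_edges` and the encoding of an event as a tuple `(v, t)` are assumptions of this sketch; the input is taken to be a simple set of contacts, which the function does not check.

```python
def network_edges(parent, contacts):
    """Return the edge set E_C of the network for a simple set of contacts.

    parent   -- dict mapping every non-root vertex v to par(v)
    contacts -- iterable of pairs (e1, e2) of concurrent events (v, t)
    """
    edges = {(p, v) for v, p in parent.items()}
    for e1, e2 in contacts:
        for v, t in (e1, e2):
            # Step 1: replace (par(v), v) by its two halves.
            edges.discard((parent[v], v))
            edges.add((parent[v], (v, t)))
            edges.add(((v, t), v))
        # Step 2: add the bidirectional lateral edge.
        edges.add((e1, e2))
        edges.add((e2, e1))
    return edges


# The tree of Fig. 1(a) with the single contact (3).
parent = {"E": "R", "F": "R", "A": "E", "B": "E", "C": "F", "D": "F"}
E_C = network_edges(parent, [(("B", 1400), ("D", 1400))])
print(len(E_C))   # 10
```

For Fig. 1 the six tree edges become eight after the two subdivisions, and the lateral edge contributes two more directed edges.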
2.3 Perfect Networks

About a simple set C of contacts (and about the corresponding network
$(V \cup V_C, E_C)$) we say that it is perfect if there exists a function
$g : (V \cup V_C) \times I \to S$ such that

(i) for every leaf v of T and every $i \in I$, g(v, i) = f(v, i);

(ii) for every $i \in I$ and every $s \in S$, if the set

$$V_{is} = \{x \in V \cup V_C : g(x, i) = s\}$$

is nonempty then the digraph $(V \cup V_C, E_C)$ has a subgraph with the set $V_{is}$
of vertices that is a rooted tree.

The first condition expresses that the function g extends f from leaves to all ancestral
languages of the network. The second condition expresses that every state s of
every character i could evolve from its original occurrence in some "root" language.

For instance, Fig. 1(b) shows a perfect network obtained from the phylogeny
of Fig. 1(a) by adding one contact, along with labels representing the values of g.
Similarly, Fig. 2(b) shows a perfect network obtained from the phylogeny of Fig. 2(a)
along with the values of the corresponding function g. In the last example, state 0
of the first character and state 1 of the second character have evolved from the root
of the given phylogeny; state 1 of the first character has evolved from the common
ancestor of B and C; state 0 of the second character could evolve from $B{\uparrow}1750$
or from $D{\uparrow}1750$. (Each of these two possibilities corresponds to a subgraph with
the vertices B, D, $B{\uparrow}1750$, $D{\uparrow}1750$ that is a rooted tree.)
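As an illustration, the following Python sketch tests conditions (i) and (ii) for a given labeling g on the network of Fig. 1(b). It relies on the standard fact that a digraph has a spanning rooted tree exactly when some vertex reaches all the others; the names `reachable_from` and `is_perfect` and the tuple encoding of events are assumptions of this sketch, and g is supplied by hand rather than searched for.

```python
def reachable_from(root, allowed, edges):
    """Vertices of `allowed` reachable from `root` along edges inside `allowed`."""
    seen, stack = {root}, [root]
    while stack:
        x = stack.pop()
        for a, b in edges:
            if a == x and b in allowed and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen


def is_perfect(edges, f, g):
    """f: leaf -> tuple of states; g: network vertex -> tuple of states."""
    # Condition (i): g agrees with f on the leaves.
    if any(g[v] != states for v, states in f.items()):
        return False
    # Condition (ii): every nonempty V_is has a vertex that reaches all of V_is.
    n_chars = len(next(iter(g.values())))
    for i in range(n_chars):
        for s in {states[i] for states in g.values()}:
            v_is = {x for x in g if g[x][i] == s}
            if not any(reachable_from(r, v_is, edges) == v_is for r in v_is):
                return False
    return True


B1, D1 = ("B", 1400), ("D", 1400)
edges = {("R", "E"), ("R", "F"), ("E", "A"), ("F", "C"),
         ("E", B1), (B1, "B"), ("F", D1), (D1, "D"),
         (B1, D1), (D1, B1)}
f = {"A": (0,), "B": (1,), "C": (0,), "D": (1,)}
g = {"R": (0,), "E": (0,), "F": (0,), "A": (0,), "C": (0,),
     "B": (1,), "D": (1,), B1: (1,), D1: (1,)}

print(is_perfect(edges, f, g))                          # True
print(is_perfect(edges - {(B1, D1), (D1, B1)}, f, g))   # False
```

Removing the lateral edge breaks condition (ii) for state 1, because B and D are then no longer connected within the set of vertices labeled 1.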

2.4 Increment to Perfect Simple Temporal Network 

We are interested in the problem of turning a temporal phylogeny into a perfect 
temporal network by adding a small number of contacts. For instance, given the 
phylogeny in Fig. 1(a), the single contact (3) is a possible answer.

It is clear that the information included in a temporal phylogeny is not sufficient
for determining the exact dates of the contacts that turn it into a perfect network.
For instance, if we shift contact (3) up or down by a few hundred years and replace
it, say, by

$$\{B{\uparrow}1200,\ D{\uparrow}1200\}$$

then the new conjecture about the past of the languages A, B, C, D will not be
distinguishable from (3).

To make this idea precise, let us select for each $v \in V \setminus \{R\}$ a new symbol $v{\uparrow}$, and
define the summary of a simple set C of contacts to be the result of replacing each
element $v{\uparrow}t$ of every contact in C with $v{\uparrow}$. Thus summaries consist of 2-element
subsets of the set

$$V{\uparrow} = \{v{\uparrow} : v \in V \setminus \{R\}\}.$$

For instance, the summary of the set of contacts of Fig. 1(b) is $\{\{B{\uparrow}, D{\uparrow}\}\}$. For
the set of contacts of Fig. 2(b), the summary is the same. Intuitively, $v{\uparrow}$ is a
language intermediate between par(v) and v that was spoken at some unspecified
time between $\tau(\mathrm{par}(v))$ and $\tau(v)$.
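A short Python sketch of the summary operation, under the assumption that an event is encoded as a tuple `(v, t)` and that the new symbol for v is represented simply by the name of v:

```python
def summary(contacts):
    """Forget the dates: replace every event (v, t) by the symbol for v."""
    return {frozenset(v for v, _t in contact) for contact in contacts}


fig1 = [{("B", 1400), ("D", 1400)}]   # contact (3)
fig2 = [{("B", 1750), ("D", 1750)}]   # the contact of Fig. 2

print(summary(fig1) == summary(fig2))   # True
```

Both sets of contacts collapse to the same summary, which is why the dates of a contact cannot be part of the answer.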

An IPSTN problem (for "Increment to Perfect Simple Temporal Network") is
defined by a phylogeny (V, E, I, S, f) and a function

$$v \mapsto (\tau_{\min}(v), \tau_{\max}(v))$$

from the vertices of the phylogeny to open intervals. (In other words, for every
$v \in V$, $\tau_{\min}(v)$ is a real number or $-\infty$, and $\tau_{\max}(v)$ is a real number or $+\infty$, such
that $\tau_{\min}(v) < \tau_{\max}(v)$.) A solution to the problem is a set of 2-element subsets of
$V{\uparrow}$ that is the summary of a perfect simple set of contacts for a temporal phylogeny
$(V, E, I, S, f, \tau)$ such that, for all $v \in V$,

$$\tau_{\min}(v) < \tau(v) < \tau_{\max}(v). \qquad (4)$$

Thus a solution is a summary, rather than a set of contacts itself. On the other
hand, as discussed in the introduction, an IPSTN problem includes a set of conditions
on a time function, rather than a specific temporal phylogeny.

Given an IPSTN problem Q and a nonnegative integer k, we want to find the
solutions X to Q such that the cardinality of X is at most k.

3 Computing Solutions 

Our approach to computing solutions is based on their characterization in terms of
"admissible sets." Whether or not a set X is admissible for an IPSTN problem Q is
completely determined by the phylogeny of Q; this property does not depend on the
time intervals $(\tau_{\min}(v), \tau_{\max}(v))$. The problem of computing admissible sets lends
itself well to the use of answer set programming in the spirit of (Erdem et al. 2003).
Proposition 1 below shows, on the other hand, that solutions to Q can be described
as the admissible sets for which a certain system of equations and inequalities has
a solution. This additional condition is easy to verify, for each admissible set, using
a constraint programming system.

3.1 Solutions as Admissible Sets 

Consider a phylogeny (V, E, I, S, f) with a root R, and a set X of 2-element subsets
of $V{\uparrow}$. By $V_X$ we denote the union of all elements of X. By $E_X$ we denote the set
obtained from E by replacing, for every $v{\uparrow} \in V_X$, the edge (par(v), v) with

$$(\mathrm{par}(v),\ v{\uparrow}) \quad\text{and}\quad (v{\uparrow},\ v)$$

and adding, for every element $\{u{\uparrow}, v{\uparrow}\}$ of X, the edges

$$(u{\uparrow},\ v{\uparrow}) \quad\text{and}\quad (v{\uparrow},\ u{\uparrow}).$$

We say that X is admissible if there exists a function $g : (V \cup V_X) \times I \to S$ such
that

(i) for every leaf v of the phylogeny and every $i \in I$, g(v, i) = f(v, i);

(ii) for every $i \in I$ and every $s \in S$, if the set

$$V_{is} = \{x \in V \cup V_X : g(x, i) = s\}$$

is nonempty then the digraph $(V \cup V_X, E_X)$ has a subgraph with the set $V_{is}$
of vertices that is a rooted tree.

In the following proposition, Q is an IPSTN problem defined by a phylogeny
(V, E, I, S, f) with a root R and a function $v \mapsto (\tau_{\min}(v), \tau_{\max}(v))$.

Proposition 1

A set X of 2-element subsets of $V{\uparrow}$ is a solution to Q iff

(i) X is admissible, and

(ii) there exists a real-valued function $\tau$ on $V \cup V_X$ such that

(a) for every $v \in V$,

$$\tau_{\min}(v) < \tau(v) < \tau_{\max}(v),$$

(b) for every $v \in V \setminus \{R\}$,

$$\tau(\mathrm{par}(v)) < \tau(v),$$

(c) for every element $v{\uparrow}$ of $V_X$,

$$\tau(\mathrm{par}(v)) < \tau(v{\uparrow}) < \tau(v),$$

(d) for every element $\{u{\uparrow}, v{\uparrow}\}$ of X,

$$\tau(u{\uparrow}) = \tau(v{\uparrow}).$$

Proof. Left-to-right. Assume that X is a solution to Q, so that there exist a
real-valued function $\tau$ on V satisfying (4) and a perfect simple set C of contacts
for the temporal phylogeny $(V, E, I, S, f, \tau)$ such that X is the summary of C. The
function from $V_C$ to $V_X$ that maps every event $v{\uparrow}t$ to $v{\uparrow}$ is a 1-1 correspondence
between the two sets. If we agree to identify every event $v{\uparrow}t$ with its image $v{\uparrow}$
under this correspondence then $E_C$ becomes identical to $E_X$, and the conditions
on g in the definition of a perfect set of contacts turn into the conditions on g in
the definition of an admissible set. Consequently (i) follows from the fact that C is
perfect. To prove (ii), extend $\tau$ from V to $V \cup V_X$:

$$\tau(v{\uparrow}) = t \quad\text{if } v{\uparrow}t \in V_C.$$

Part (a) follows from (4); part (b) follows from the definition of a temporal
phylogeny; part (c) follows from (2); part (d) follows from the definition of a contact.

Right-to-left. Assume that X satisfies conditions (i) and (ii). Consider the temporal
phylogeny T that consists of the given phylogeny (V, E, I, S, f) and the function
$\tau$ restricted to V. By (a), T satisfies (4). Let C be the set obtained from X by
replacing each symbol $v{\uparrow}$ in every element of X with the event $v{\uparrow}t$ where $t = \tau(v{\uparrow})$.
From (d) we conclude that the elements of C are contacts; by (c), C is simple. It
is clear that X is the summary of C. The same reasoning as in the first half of the
proof shows that, in view of (i), C is perfect.
3.2 Answer Set Programming

The idea of answer set programming is to represent a given computational problem
as a logic program whose answer sets (stable models) (Gelfond and Lifschitz 1988)
correspond to solutions. Systems that compute answer sets for a logic program are
called answer set solvers. System SMODELS with its front-end LPARSE is one of the
most widely used answer set solvers today. The system CMODELS is another answer
set solver, and it also uses LPARSE as its front-end. This newer system does not
have its own search engine; it is essentially a compiler that translates the problem
of computing answer sets into a propositional satisfiability problem (or into a series
of propositional satisfiability problems), and invokes a SAT solver, such as zchaff,⁷
to perform search.

Unlike Prolog systems, which expect from the user a program and a query, an 
answer set solver starts computing given a program only. For instance, we can give 
SMODELS the input program 

p(0).
q(1).

r(X) :- p(X).
r(X) :- q(X).

and it will produce the output

Stable Model: r(0) r(1) q(1) p(0)

which is the set of all ground queries to which Prolog would respond yes. Given the
program

p(0).
p(1).

q(1-X) :- p(X), not q(X).

SMODELS responds

Answer: 1
Stable Model: q(1) p(1) p(0)
Answer: 2
Stable Model: q(0) p(1) p(0)

This output means, intuitively, that the program can be understood in two ways: 
either q(0) is false and q(l) is true, or the other way around. For this program 
(and for other programs with several answer sets) there is no simple relationship 
between the behavior of Prolog and the behavior of answer set solvers. 
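To make the notion of a stable model concrete, here is a brute-force Python sketch for small ground programs. It is not how SMODELS or CMODELS work; it only enumerates candidate sets and tests each against the Gelfond-Lifschitz reduct. The rule encoding `(head, positive_body, negative_body)` and the function names are assumptions of this sketch, and the second program above is grounded by hand.

```python
from itertools import combinations


def least_model(definite_rules):
    """Least model of a negation-free ground program, by fixpoint iteration."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and pos <= model:
                model.add(head)
                changed = True
    return model


def stable_models(rules):
    """rules: list of (head, positive_body, negative_body) over ground atoms."""
    atoms = sorted({h for h, _, _ in rules}
                   | {a for _, pos, neg in rules for a in pos | neg})
    models = []
    for n in range(len(atoms) + 1):
        for cand in map(set, combinations(atoms, n)):
            # Reduct: drop rules blocked by cand, then drop the "not" literals.
            reduct = [(h, pos) for h, pos, neg in rules if not (neg & cand)]
            if least_model(reduct) == cand:
                models.append(cand)
    return models


program = [
    ("p(0)", set(), set()),
    ("p(1)", set(), set()),
    ("q(1)", {"p(0)"}, {"q(0)"}),   # q(1-X) :- p(X), not q(X).  with X = 0
    ("q(0)", {"p(1)"}, {"q(1)"}),   # the same rule with X = 1
]
for m in stable_models(program):
    print(sorted(m))
# ['p(0)', 'p(1)', 'q(0)']
# ['p(0)', 'p(1)', 'q(1)']
```

The two sets found correspond to the two answers reported by SMODELS above, listed in a different order.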

A Prolog program can be viewed as a collection of definitions of predicates. In 
addition to such "defining" rules, lparse programs often include rules of two other 
kinds — "choice rules" and "constraints." For example, 



{p,q,r,s}. 
is a choice rule. Its answer sets are arbitrary subsets of {p, q, r, s}. Intuitively, this
rule says: for each of the atoms p, q, r, s, choose arbitrarily whether to include it
in the answer set. A choice rule may include restrictions on the cardinality of the
answer set. For instance, the answer sets of

2 {p,q,r,s} 3.

are the subsets of {p, q, r, s} whose cardinality is at least 2 and at most 3.

A constraint is, syntactically, a rule with the empty head. The effect of adding 
a constraint to a program is to make the collection of its answer sets smaller — to 
remove the answer sets that "violate" the constraint. For instance, by adding the 
constraint 

:- p, not q. 

to a program we remove its answer sets that include p and do not include q. 
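The effect of a cardinality bound and of a constraint can be checked by direct enumeration. The Python sketch below lists the answer sets of the choice rule `2 {p,q,r,s} 3.` and then filters them by the constraint `:- p, not q.`; the variable names are hypothetical.

```python
from itertools import combinations

atoms = ["p", "q", "r", "s"]

# Answer sets of the choice rule  2 {p,q,r,s} 3.
chosen = [set(c) for n in (2, 3) for c in combinations(atoms, n)]

# Adding the constraint  :- p, not q.  removes the sets with p but without q.
kept = [a for a in chosen if not ("p" in a and "q" not in a)]

print(len(chosen), len(kept))   # 10 7
```

The three sets removed are {p, r}, {p, s} and {p, r, s}.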

A detailed description of the input language of LPARSE can be found in the online
manual (http://www.tcs.hut.fi/Software/smodels/lparse.ps.gz).

3.3 Generating Admissible Sets 

An LPARSE program for generating admissible sets is shown in Fig. 3. Table 1 shows
the correspondence between the symbols used in the program and the notation
introduced in Sections 2, 3.1 and 3.3.⁸ The program should be combined with the
definition of the domain predicates vertex, e, character, state, f describing the
given phylogeny.

The first three lines of the program tell LPARSE that U and V range over vertices, I
ranges over characters, and S over states. The vertices of the phylogeny are assumed
to be integers, and the expression U < V in the program is understood accordingly.
The verification of condition (ii) from the definition of an admissible set is based
on the fact that (ii) can be equivalently stated as follows: for every $i \in I$ and every
$s \in S$, if the set $V_{is}$ is nonempty then there is a vertex $v_{is} \in V_{is}$ such that all elements
of $V_{is}$ are reachable from $v_{is}$ in $V_{is}$ (Erdem et al. 2003, Proposition 1).

The algorithm for solving IPSTN problems suggested by the discussion above
consists of two steps: compute admissible sets by running an answer set solver on
the program from Fig. 3, and then use a constraint programming system to check,
for each of these sets X, whether the equations and inequalities from part (ii) of
the statement of Proposition 1 have a solution in real numbers $\tau(v)$, $v \in V \cup V_X$.

This basic algorithm can be improved using the following observation. Let X be
a solution to the given IPSTN problem. Consider the numbers $\tau(v)$ from part (ii)
of the statement of Proposition 1. Conditions (a) and (c) imply that for every $v{\uparrow}$

⁸ The representation of $V{\uparrow}$ by pre(V) is suggested by the distinction between a "proto" language
and its "pre-proto" stage in historical linguistics. The term "proto-Germanic," for instance,
represents a language that was about to split up into Germanic languages, each spoken by
a different speech community; the speech of the ancestors of the proto-Germanic generation,
slowly changing all the time, is referred to as pre-proto-Germanic.
#domain vertex(U;V).
#domain character(I).
#domain state(S).

{x(UU,VV): vertex(UU;VV): UU != r: VV != r: UU < VV} k.

xx(U,V) :- x(U,V), U < V.
xx(V,U) :- x(U,V), U < V.

v_x(U) :- xx(U,V).

% definition of admissibility, part (i)

g(V,I,S) :- f(V,I,S).

1 {g(V,I,SS): state(SS)} 1 :- e(V,U).

1 {g(pre(V),I,SS): state(SS)} 1 :- v_x(V).

% definition of admissibility, part (ii)

{root(V,I,S)} :- g(V,I,S).

{root(pre(V),I,S)} :- g(pre(V),I,S).

:- root(U,I,S), root(V,I,S), U < V.
:- root(U,I,S), root(pre(V),I,S).
:- root(pre(U),I,S), root(pre(V),I,S), U < V.

reachable(V,I,S) :- root(V,I,S).
reachable(pre(V),I,S) :- root(pre(V),I,S).

reachable(pre(V),I,S) :- e(U,V), g(pre(V),I,S), reachable(U,I,S).
reachable(V,I,S) :- v_x(V), g(V,I,S), reachable(pre(V),I,S).
reachable(V,I,S) :- e(U,V), not v_x(V), g(V,I,S), reachable(U,I,S).
reachable(pre(U),I,S) :- xx(U,V), g(pre(U),I,S), reachable(pre(V),I,S).

:- g(V,I,S), not reachable(V,I,S).

:- g(pre(V),I,S), not reachable(pre(V),I,S).

Fig. 3. An LPARSE program for generating admissible sets.
in $V_X$

$$\tau_{\min}(\mathrm{par}(v)) < \tau(\mathrm{par}(v)) < \tau(v{\uparrow}) < \tau(v) < \tau_{\max}(v),$$

so that $\tau(v{\uparrow})$ belongs to the interval $(\tau_{\min}(\mathrm{par}(v)), \tau_{\max}(v))$. In view of (d), it follows
that for every element $\{u{\uparrow}, v{\uparrow}\}$ of X, the intervals

$$(\tau_{\min}(\mathrm{par}(u)),\ \tau_{\max}(u)) \quad\text{and}\quad (\tau_{\min}(\mathrm{par}(v)),\ \tau_{\max}(v))$$

overlap. Consequently, extending the program from Fig. 3 by the constraints

:- x(u,v).    (5)

for the pairs u, v for which these intervals do not overlap will not lead to the loss
of solutions.
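The pruning step can be sketched in Python as follows: for every pair of non-root vertices, compare the two open intervals and report the pairs that would receive a constraint of the form (5). The function names are hypothetical. The intervals for E and F are the ones used as an example in the introduction; the intervals for R and for the leaves are made up for this sketch.

```python
import math
from itertools import combinations


def window(v, parent, tmin, tmax):
    """Open interval (tmin(par(v)), tmax(v)) that must contain the date of v's event."""
    return (tmin[parent[v]], tmax[v])


def overlap(i1, i2):
    """Two open intervals overlap iff the larger left end is below the smaller right end."""
    return max(i1[0], i2[0]) < min(i1[1], i2[1])


def prunable_pairs(parent, tmin, tmax):
    """Pairs u, v whose windows are disjoint, so that x(u,v) can be forbidden."""
    return [(u, v) for u, v in combinations(sorted(parent), 2)
            if not overlap(window(u, parent, tmin, tmax),
                           window(v, parent, tmin, tmax))]


parent = {"E": "R", "F": "R", "A": "E", "B": "E", "C": "F", "D": "F"}
# E and F as in the introduction; the other intervals are hypothetical.
tmin = {"R": -math.inf, "E": -100, "F": 600,
        "A": 1900, "B": 1900, "C": 1900, "D": 1900}
tmax = {"R": -500, "E": 500, "F": 1100,
        "A": 2100, "B": 2100, "C": 2100, "D": 2100}

print(prunable_pairs(parent, tmin, tmax))   # [('C', 'E'), ('D', 'E')]
```

With these intervals a language between R and E cannot have been in contact with a language between F and C or between F and D, in agreement with the informal argument of the introduction.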
Table 1. Explanation of the symbols used in Fig. 3.

LPARSE program       Mathematical notation
-----------------    ------------------------------------------------
vertex(V)            $V \in V$
character(I)         $I \in I$
state(S)             $S \in S$
e(U,V)               $(U, V) \in E$
r                    R
pre(V)               $V{\uparrow}$
x(U,V)               $\{U{\uparrow}, V{\uparrow}\} \in X$ and U < V
xx(U,V)              $\{U{\uparrow}, V{\uparrow}\} \in X$
v_x(V)               $V{\uparrow} \in V_X$
f(V,I,S)             f(V, I) = S
g(V,I,S)             g(V, I) = S
root(V,I,S)          $V = v_{IS}$
reachable(V,I,S)     V is reachable from $v_{IS}$



3.4 Making the Program Tight 

The operation of CMODELS (Giunchiglia et al. 2004) is based on the fact that the
answer sets for a program can be described as the models of the program's completion
that satisfy its loop formulas (Lin and Zhao 2002). This process is particularly
simple in the case when the program is tight, because a tight program
has no loops, and its set of loop formulas is empty (Erdem and Lifschitz 2003),
(Lee and Lifschitz 2003, Section 5). The difference between tight and non-tight
programs can be illustrated with a simple example: a program containing the rules

p :- q,r.
q :- p,s.

is not tight, because it contains the loop {p, q}.
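For a finite ground program, one way to test for loops is to look for a cycle in the positive dependency graph, which has an edge from the head of every rule to each atom in its positive body. The Python sketch below does this with a depth-first search; the function names and the rule encoding `(head, positive_body)` are assumptions of this sketch, and negative body literals are ignored because they do not contribute to loops.

```python
def positive_dependency_graph(rules):
    """rules: list of (head, positive_body). Returns atom -> set of atoms it depends on."""
    graph = {}
    for head, pos in rules:
        graph.setdefault(head, set()).update(pos)
        for atom in pos:
            graph.setdefault(atom, set())
    return graph


def has_loop(graph):
    """Depth-first search for a cycle in the dependency graph."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {a: WHITE for a in graph}

    def visit(a):
        color[a] = GREY
        for b in graph[a]:
            if color[b] == GREY or (color[b] == WHITE and visit(b)):
                return True
        color[a] = BLACK
        return False

    return any(color[a] == WHITE and visit(a) for a in graph)


rules = [("p", {"q", "r"}), ("q", {"p", "s"})]
print(has_loop(positive_dependency_graph(rules)))   # True: p and q depend on each other
```

Dropping p from the body of the second rule removes the cycle, and the same test then reports no loop.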

The usual recursive definition of the reachability of a vertex in a digraph is tight
only when the graph is acyclic. In Fig. 3 the atom reachable(V,I,S) expresses
that V can be reached from $v_{IS}$ in the subgraph of the network induced by $V_{IS}$; since
the network contains cycles, the program in Fig. 3 is not tight.

But we can make this program tight using a transformation somewhat similar to
the process of tightening described in (Lifschitz 1996, Section 3.2). The network is
obtained from a tree by adding at most k bidirectional lateral edges. Consequently,
the shortest path between any pair of vertices in the network includes at most
k lateral edges. Consider the auxiliary atoms rj(V,I,S,J), expressing that there
exists a path from $v_{IS}$ to V in $V_{IS}$ that contains exactly J lateral edges ($0 \le J \le k$).
The predicate rj can be characterized by a tight definition:

rj(V,I,S,0) :- root(V,I,S).
rj(pre(V),I,S,J) :- root(pre(V),I,S).

rj(pre(V),I,S,J) :- e(U,V), g(pre(V),I,S), rj(U,I,S,J).
rj(V,I,S,J) :- v_x(V), g(V,I,S), rj(pre(V),I,S,J).
rj(V,I,S,J) :- e(U,V), not v_x(V), g(V,I,S), rj(U,I,S,J).
rj(pre(U),I,S,J+1) :- xx(U,V), g(pre(U),I,S), rj(pre(V),I,S,J), J < k.

Then reachable can be defined in terms of rj by the rules

reachable(V,I,S) :- rj(V,I,S,J).
reachable(pre(V),I,S) :- rj(pre(V),I,S,J).

This optimization has a significant effect on the computation time of CMODELS.
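The idea behind the counter J can be illustrated outside of logic programming. The Python sketch below computes, for every vertex reachable from a root, the least number of lateral edges on a path to it, never using more than k of them. This is a variation on rj (least count rather than exact count), and the function name and the vertex names `B1`, `D1` are assumptions of this sketch. The example is the subgraph of Fig. 1(b) induced by the vertices labeled 1.

```python
def lateral_levels(root, tree_edges, lateral_edges, k):
    """Map each vertex reachable with at most k lateral edges to the least count used."""
    level = {}
    frontier = {root}
    for j in range(k + 1):
        # Close the frontier under tree edges; they leave the counter unchanged.
        stack = [x for x in frontier if x not in level]
        for x in stack:
            level[x] = j
        while stack:
            x = stack.pop()
            for a, b in tree_edges:
                if a == x and b not in level:
                    level[b] = j
                    stack.append(b)
        # One lateral step increases the counter by one.
        frontier = {b for a, b in lateral_edges
                    if level.get(a) == j and b not in level}
    return level


tree_edges = [("B1", "B"), ("D1", "D")]
lateral_edges = [("B1", "D1"), ("D1", "B1")]

print(lateral_levels("B1", tree_edges, lateral_edges, 1))
# {'B1': 0, 'B': 0, 'D1': 1, 'D': 1}
print(lateral_levels("B1", tree_edges, lateral_edges, 0))
# {'B1': 0, 'B': 0}
```

Because the counter only grows along lateral edges and is bounded by k, the search never revisits a vertex through a cycle, which mirrors the reason why the definition of rj is tight.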



4 Contacts Between Indo-European Languages 

We have applied the concept of a temporal phylogenetic network and the computational 
methods described above to the problem of generating conjectures about contacts 
between prehistoric Indo-European languages, discussed earlier in (Nakhleh et al. 2005), 
and also in (Erdem et al. 2003) and (Nakhleh 2004, Chapter 13). In these experiments, 
CMODELS was used as the answer set solver, and ECLiPSe as the constraint 
programming system. 

The problems addressed in these experiments are more general than the IPSTN problems 
discussed in Sections 2 and 3: we were interested in sets of contacts that are 
not necessarily simple in the sense of Section 2. The theory and the computational 
methods presented above have been extended to "IPTN problems" involving 
networks that may have several lateral edges meeting the same tree edge, and that 
may have lateral edges incident to the vertices of the given phylogeny (the 
possibilities ruled out in the definition of a simple set of contacts). The program 
presented earlier has been modified accordingly, and it was also optimized by allowing 
function g to be partial, as in (Erdem et al. 2003, Section 5). 

4.1 A Phylogeny of Indo-European Languages 

As the starting point, we took a phylogeny of Indo-European languages based on the 
"unscreened IE dataset" published at http://www.cs.rice.edu/~nakhleh/CPHL/ 
(without characters that are uninformative or that exhibit known parallel 
development of states), and on the genetic tree shown in (Nakhleh et al. 2005, Fig. 5) 
(published originally in (Ringe et al. 2002)). Using the methods discussed in 
(Erdem et al. 2003, Sections 3 and 4), we extracted from that phylogeny a small part 
that appears to contain all components essential for the task of reconstructing 
contacts between prehistoric Indo-European languages. 

The vertices and edges of this smaller phylogeny are shown in Fig. 4. All vertices 
except two (Old Church Slavonic and Albanian) are "prehistoric" languages, 
reconstructed by comparing their descendants. For instance, proto-Celtic has been 
reconstructed from what is known about its recorded descendants, Old Irish and 
Welsh (and the fragmentarily attested Continental Celtic languages of antiquity). 



Fig. 4. A phylogenetic tree of Indo-European languages. The languages that do not 
have commonly accepted names are labeled by numbers. 



Table 2. The essential character states of some lexical characters for the languages 
denoted by the leaves of the phylogeny in Fig. 4. A dash marks an empty cell. 

                         'one'   'arm'   'beard'   'free'   'pour'   'tear' 
proto-Indo-Iranian         -       5        1        -        -       11 
proto-Baltic              11       8        5        -        6       11 
Old Church Slavonic        -       -        5        -        6        - 
proto-Greco-Armenian       2       -        1        3        3        2 
proto-Germanic            11       8        5       10       14        2 
Albanian                   2       -        1        -        -        - 
proto-Italic              11       -        5        3       14        2 
proto-Celtic              11       -        -       10        -        2 
proto-Tocharian            2       5        -        -        3       11 
proto-Anatolian            -       -        1        -        -        - 

The phylogeny has 16 qualitative characters (all lexical), and each character has 2 
or 3 states. Some of the essential character states are shown in Table 2.* 

* Let $(V, E)$ be a phylogeny along with a set $I$ of characters, a set $S$ of character 
states, and a function $f$ from $L \times I$ to $S$, where $L$ is the set of leaves of the 
tree. A state $s \in S$ is essential with respect to a character $j \in I$ if there exist 
two different leaves $l_1$ and $l_2$ in $L$ such that $f(l_1, j) = f(l_2, j) = s$. 
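
The definition of an essential state can be checked mechanically against the entries of Table 2. The following Python sketch is an illustration only: the dictionary encoding, the use of `None` for cells that are empty in the table, and the function name `essential_states` are assumptions made here, while the numbers are the ones listed in Table 2.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set

CHARACTERS = ["one", "arm", "beard", "free", "pour", "tear"]

# Rows of Table 2; None stands for a cell that is empty in the table.
TABLE_2: Dict[str, List[Optional[int]]] = {
    "proto-Indo-Iranian":   [None, 5, 1, None, None, 11],
    "proto-Baltic":         [11, 8, 5, None, 6, 11],
    "Old Church Slavonic":  [None, None, 5, None, 6, None],
    "proto-Greco-Armenian": [2, None, 1, 3, 3, 2],
    "proto-Germanic":       [11, 8, 5, 10, 14, 2],
    "Albanian":             [2, None, 1, None, None, None],
    "proto-Italic":         [11, None, 5, 3, 14, 2],
    "proto-Celtic":         [11, None, None, 10, None, 2],
    "proto-Tocharian":      [2, 5, None, None, 3, 11],
    "proto-Anatolian":      [None, None, 1, None, None, None],
}


def essential_states(table: Dict[str, List[Optional[int]]],
                     characters: List[str]) -> Dict[str, Set[int]]:
    """For each character, the states shared by at least two different leaves."""
    leaves_by_state = {c: defaultdict(set) for c in characters}
    for leaf, states in table.items():
        for character, state in zip(characters, states):
            if state is not None:
                leaves_by_state[character][state].add(leaf)
    return {
        character: {s for s, leaves in by_state.items() if len(leaves) >= 2}
        for character, by_state in leaves_by_state.items()
    }


for character, states in essential_states(TABLE_2, CHARACTERS).items():
    print(character, sorted(states))
```

Every state that appears in the table is shared by at least two leaves, as the definition requires.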
Table 3. Time intervals for the languages from Fig. 4. 

v                        (τmin(v), τmax(v)) 
proto-Indo-European      (-4500, -3800) 
proto-Indo-Iranian       (-2100, -1700) 
proto-Balto-Slavic       (-1400, -800) 
proto-Baltic             (600, 1000) 
Old Church Slavonic      (870, 1000) 

proto-Germanic           (-400, 0) 
Albanian                 (1800, 2100) 
proto-Italo-Celtic       (-3000, -2400) 
proto-Italic             (-1500, -1000) 
proto-Celtic             (-700, -300) 
proto-Tocharian          (-700, -300) 
proto-Anatolian          (-2500, -2100) 
Vertex 28                (-3900, -3300) 
Vertex 29                (-3600, -3000) 
Vertex 30                (-3500, -2900) 
Vertex 31                (-2400, -1800) 
Vertex 39                (-3400, -2800) 
Vertex 41                (-2600, -2200) 

Table 3 shows, for each vertex of the tree, our assumptions about the time when 
the corresponding language was spoken. Our calculations assume, for instance, that 
proto-Indo-Iranian was spoken by a generation that lived between 2100 BCE and 
1700 BCE. 

Estimating the dates of prehistoric languages is a matter of informed guesswork, 
because rates of linguistic change are known to vary not only over time but also 
between lineages (see especially Bergsland and Vogt 1962). Relevant archaeological 
evidence must be taken into account, but it rarely settles important disputes, 
because the material remains of a culture typically reveal nothing about the language 
(or languages) spoken, in the absence of written documents. The dates suggested 
here for internal nodes of the IE tree are estimates and are presented with 
considerable diffidence. For a good summary and discussion of the archaeological 
evidence the reader is referred to (Mallory 1989). 

Some solutions in the sense of Section 2 do not represent viable conjectures about 
the evolution of Indo-European languages for geographical reasons. For instance, 
a contact between pre-proto-Celtic and pre-proto-Baltic is unlikely because the 
former was spoken in western Europe, while the Balts were probably confined to 
a fairly small area in northeastern Europe. We have eliminated several unrealistic 
possibilities of this kind at the stage of computing admissible sets, by including 
additional constraints. For instance, the possibility above can be 
eliminated by adding to the lparse program that generates admissible sets the 
constraint: 






Fig. 5. A conjecture about contacts between Indo-European languages generated 
by CMODELS and accepted by ECLiPSe. 



:- x(38,43). 

where 38 denotes proto-Celtic and 43 denotes proto-Baltic. 
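
The effect of such a constraint is to discard every candidate set of contacts that contains the excluded pair. The Python sketch below illustrates this under assumptions made only for the example: contacts are modelled as unordered pairs of vertex numbers, candidates as sets of such pairs, and all vertex numbers other than 38 and 43 are hypothetical. It is not the lparse encoding itself, where the solver applies the constraint during search.

```python
from typing import FrozenSet, Iterable, List, Set

Contact = FrozenSet[int]  # an unordered pair of vertex numbers (assumed encoding)

# Pairs excluded on geographical grounds; 38 (proto-Celtic) and 43 (proto-Baltic)
# are taken from the text.
FORBIDDEN: Set[Contact] = {frozenset({38, 43})}


def allowed(candidate: Iterable[Contact], forbidden: Set[Contact] = FORBIDDEN) -> bool:
    """True if the candidate set of contacts contains no forbidden pair."""
    return forbidden.isdisjoint(candidate)


def prune(candidates: Iterable[Set[Contact]]) -> List[Set[Contact]]:
    """Keep only the candidates that satisfy every geographical constraint."""
    return [candidate for candidate in candidates if allowed(candidate)]


# Hypothetical candidates; vertex numbers 36, 37 and 40 are made up for the example.
CANDIDATES = [
    {frozenset({38, 43}), frozenset({36, 37})},  # uses the excluded contact
    {frozenset({36, 37}), frozenset({37, 40})},  # does not
]
print(len(prune(CANDIDATES)))
```

Only the second candidate survives, since the first contains the contact between vertices 38 and 43.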



4.2 Results 

The problem described in Section 4.1, with the additional geographical constraints 
mentioned above, turns out to have no solutions consisting of fewer than 3 contacts. 
There are three solutions of cardinality 3. (To be precise, we should say "three 
essentially different solutions," because a summary does not specify the exact times 
of contacts.) The first (Fig. 5) involves contacts between 

pre-Old Church Slavonic and pre-proto-Tocharian, 
pre-proto-Germanic and pre-proto-Celtic, 
pre-proto-Balto-Slavic and pre-proto-Celtic; 



the second, contacts between 

pre-Old Church Slavonic and pre-proto-Tocharian, 
pre-proto-Germanic and pre-proto-Italic, 
pre-proto-Italic and pre-proto-Balto-Slavic; 



the third, contacts between 
pre-proto-Italic and pre-proto-Greco-Armenian, 
pre-proto-Germanic and pre-proto-Italic, 
pre-proto-Baltic and pre-proto-Germanic. 

All three summaries generated by CMODELS have been accepted by the ECLiPSe filter 
as solutions (which means that all relevant chronological information was expressed 
in this case by the constraints shown at the end of Section 3.3). They have been 
computed in about 40 minutes of CPU time using lparse 1.0.13, CMODELS 2.10, 
ZCHAFF Z2003.11.04, and ECLiPSe 3.5.2, on a PC with a 733 MHz Intel Pentium III 
processor and 256MB RAM, running SuSE Linux (Version 8.1). 

We have also determined, using CMODELS, that there exist 193 admissible sets of 
cardinality 4 that are minimal with respect to set inclusion; out of those, 14 have 
been rejected by ECLiPSe. Some of the 4-edge solutions represent plausible 
conjectures about the history of Indo-European languages. One such solution includes, 
for instance, contacts between 

pre-Old Church Slavonic and pre-proto-Tocharian, 
pre-proto-Germanic and pre-proto-Italic, 
pre-proto-Germanic and pre-proto-Celtic, 
pre-proto-Germanic and pre-proto-Baltic. 
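
Minimality with respect to set inclusion can be illustrated with the solutions listed above. The Python sketch below is not how CMODELS establishes minimality; it is a brute-force check written for illustration, with language names shortened by dropping the "pre-" and "proto-" prefixes. The set `PADDED` is a hypothetical non-minimal set added to show what gets filtered out.

```python
from typing import FrozenSet, List, Sequence, Tuple

Contact = FrozenSet[str]
ContactSet = FrozenSet[Contact]


def contacts(*pairs: Tuple[str, str]) -> ContactSet:
    """Build a set of unordered contacts from pairs of language names."""
    return frozenset(frozenset(pair) for pair in pairs)


def inclusion_minimal(sets: Sequence[ContactSet]) -> List[ContactSet]:
    """Keep the sets that have no proper subset among the given sets."""
    unique = list(dict.fromkeys(sets))  # drop duplicates, keep input order
    return [s for s in unique if not any(t < s for t in unique)]


# The three 3-edge solutions and the 4-edge example from the text.
FIRST = contacts(("OCS", "Tocharian"), ("Germanic", "Celtic"), ("Balto-Slavic", "Celtic"))
SECOND = contacts(("OCS", "Tocharian"), ("Germanic", "Italic"), ("Italic", "Balto-Slavic"))
THIRD = contacts(("Italic", "Greco-Armenian"), ("Germanic", "Italic"), ("Baltic", "Germanic"))
FOUR_EDGE = contacts(("OCS", "Tocharian"), ("Germanic", "Italic"),
                     ("Germanic", "Celtic"), ("Germanic", "Baltic"))

# Hypothetical: the third solution with one extra contact, hence not minimal.
PADDED = THIRD | contacts(("OCS", "Tocharian"))

minimal = inclusion_minimal([FIRST, SECOND, THIRD, FOUR_EDGE, PADDED])
print(len(minimal), FOUR_EDGE in minimal, PADDED in minimal)
```

The 4-edge example contains none of the three 3-edge solutions as a subset, so it stays in the result, while the padded set is dropped.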

4.3 Comparison with Earlier Work 

The three 3-edge solutions listed in Section 4.2 are identical to the solutions that 
are marked as "feasible" in (Nakhleh et al. 2005, Table 3). That table shows the 
16 sets of lateral edges generated by MIPPN, the software tool designed for solving 
the Minimum Increment to Perfect Phylogenetic Network problem. It is different 
from the computational problem that we solve here using logic programming tools 
in that its input does not include any chronological or geographical information. 
The 16 sets of contacts produced by MIPPN were scrutinized by a specialist in the 
history of Indo-European languages, who determined that most of them are not 
plausible from the point of view of historical linguistics. The remaining 3 sets 
were then declared feasible. The logic programming approach, on the other hand, allowed 
us to express the necessary expert knowledge about chronological and geographical 
constraints in formal notation, and to give this information to the program as part 
of input, along with the phylogeny. All "implausible" solutions were weeded out in 
this case by CMODELS without human intervention. 

In the experiments described in (Erdem et al. 2003), chronological and geographical 
information was not part of the input either. But those experiments were similar 
to the work described in this paper in that search, in both cases, was performed using 
answer set solvers: SMODELS in (Erdem et al. 2003), and CMODELS with ZCHAFF 
in this project. The difference in the computational efficiency between the two 
engines turned out to be significant. With the new tools available, we did not have 
to employ the "divide-and-conquer" strategy described in (Erdem et al. 2003, 
Section 6). The time needed to compute the 3-edge solutions went down from over 150 
hours to around 40 minutes. For comparison, the computation time of MIPPN in 
the same application was around 8 hours (Nakhleh et al. 2005, Section 5.3). 
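
As a rough check of these figures, take "over 150 hours" as 150 hours and "around 40 minutes" as 40 minutes. Both are approximations, so the ratios below are only indicative:

```latex
\[
\frac{150 \times 60\ \text{min}}{40\ \text{min}} = \frac{9000}{40} = 225 \approx 10^{2.35},
\qquad
\frac{8 \times 60\ \text{min}}{40\ \text{min}} = \frac{480}{40} = 12 .
\]
```

The first ratio corresponds to a speedup of at least two orders of magnitude over the SMODELS experiments; the second compares with the reported MIPPN time, which was obtained for a problem without chronological or geographical input.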

5 Conclusion 

The mathematical model of the evolutionary history of natural languages proposed 
in (Nakhleh et al. 2005) enriched the traditional "evolutionary tree" model by 
allowing languages in different branches of the tree to trade their characteristics. In 
that theory, phylogenetic networks take the place of trees. In this paper we discussed 
a further enhancement of the phylogenetic network model, which incorporates a 
real-valued function assigning times to the vertices of the network and prohibits a 
contact between two languages if it is chronologically impossible. The use of the 
time function allows us to reduce the number of networks that are mathematically 
"perfect" but do not represent historically plausible conjectures. 

Computing perfect temporal networks can be accomplished by a combination 
of an answer set programming "generator" with a constraint logic programming 
"filter." An alternative approach to combining computational methods developed 
in these two subareas of logic programming is discussed in (Elkabani et al. 2004). 

In application to the problem of computing perfect networks for a phylogeny of 
Indo-European languages, the use of CMODELS with ZCHAFF has improved the 
computation time by two orders of magnitude in comparison with the use of SMODELS 
in earlier experiments. 